Dataset statistics
| Number of variables | 20 |
|---|---|
| Number of observations | 2988181 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 456.0 MiB |
| Average record size in memory | 160.0 B |
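The memory figures above can be cross-checked with a little arithmetic. The sketch below assumes that every cell occupies 8 bytes (for example an int64 value or an object pointer); that is an assumption made for illustration, not something the report states.

```python
# Values copied from the "Dataset statistics" table above.
n_rows = 2_988_181
n_cols = 20
total_mib = 456.0

MIB = 1024 ** 2  # bytes per mebibyte

# Average record size = total memory / number of observations.
avg_record_bytes = total_mib * MIB / n_rows

# Assumption: each cell is stored as an 8-byte value.
bytes_per_cell = avg_record_bytes / n_cols
column_mib = n_rows * 8 / MIB

print(f"average record size: {avg_record_bytes:.1f} B")
print(f"bytes per cell:      {bytes_per_cell:.2f}")
print(f"one 8-byte column:   {column_mib:.1f} MiB")
```

Under that assumption a single column comes to about 22.8 MiB, which agrees with the "Memory size" reported for each variable.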
Variable types
| Numeric | 15 |
|---|---|
| Categorical | 5 |
publisher_id has constant value "0" | Constant |
session_start has a high cardinality: 646874 distinct values | High cardinality |
click_timestamp has a high cardinality: 2983198 distinct values | High cardinality |
created_at_ts has a high cardinality: 45785 distinct values | High cardinality |
delta_timestamp has a high cardinality: 845906 distinct values | High cardinality |
session_end has a high cardinality: 1409401 distinct values | High cardinality |
Unnamed: 0 is highly correlated with session_id | High correlation |
session_id is highly correlated with Unnamed: 0 | High correlation |
article_id is highly correlated with category_id | High correlation |
category_id is highly correlated with article_id | High correlation |
click_deviceGroup is highly correlated with click_os | High correlation |
click_os is highly correlated with click_deviceGroup | High correlation |
user_id is highly correlated with session_id | High correlation |
session_id is highly correlated with Unnamed: 0 and 1 other fields | High correlation |
click_country is highly correlated with click_region | High correlation |
click_region is highly correlated with click_country | High correlation |
Unnamed: 0 is uniformly distributed | Uniform |
click_timestamp is uniformly distributed | Uniform |
Unnamed: 0 has unique values | Unique |
publisher_id has 2988181 (100.0%) zeros | Zeros |
Reproduction
| Analysis started | 2022-10-02 19:52:30.641737 |
|---|---|
| Analysis finished | 2022-10-02 20:01:24.879740 |
| Duration | 8 minutes and 54.24 seconds |
| Software version | pandas-profiling v3.2.0 |
| Download configuration | config.json |
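The reported duration can be reproduced from the two timestamps in the table above. This is a minimal sketch using only the standard library; the format string is an assumption about how the timestamps were written.

```python
from datetime import datetime

# Assumed timestamp layout, matching the values shown in the table.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2022-10-02 19:52:30.641737", FMT)
finished = datetime.strptime("2022-10-02 20:01:24.879740", FMT)

elapsed = finished - started
minutes, seconds = divmod(elapsed.total_seconds(), 60)

print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```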
Unnamed: 0
Real number (ℝ≥0)
HIGH CORRELATION, UNIFORM, UNIQUE
| Distinct | 2988181 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1494090 |
| Minimum | 0 |
|---|---|
| Maximum | 2988180 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 22.8 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 149409 |
| Q1 | 747045 |
| median | 1494090 |
| Q3 | 2241135 |
| 95-th percentile | 2838771 |
| Maximum | 2988180 |
| Range | 2988180 |
| Interquartile range (IQR) | 1494090 |
Descriptive statistics
| Standard deviation | 862613.6967 |
|---|---|
| Coefficient of variation (CV) | 0.577350559 |
| Kurtosis | -1.2 |
| Mean | 1494090 |
| Median Absolute Deviation (MAD) | 747045 |
| Skewness | -1.014664281 × 10⁻¹⁵ |
| Sum | 4.46461135 × 10¹² |
| Variance | 7.441023897 × 10¹¹ |
| Monotonicity | Strictly increasing |
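Unnamed: 0 is unique, strictly increasing, and runs from 0 to 2988180, which suggests it is a saved row index. Under that assumption its statistics follow from closed-form expressions for the integers 0..n-1. The sketch below uses the sample variance (ddof = 1), which appears to be the convention that reproduces the reported figures.

```python
import math

n = 2_988_181  # number of observations; the index is assumed to run 0 .. n-1

mean = (n - 1) / 2
total = n * (n - 1) // 2

# Sample variance (ddof=1) of the integers 0..n-1 is n(n+1)/12.
variance = n * (n + 1) / 12
std = math.sqrt(variance)

# Coefficient of variation; tends to 1/sqrt(3) for large n.
cv = std / mean

print(f"mean     = {mean:.0f}")
print(f"sum      = {total:.8e}")
print(f"variance = {variance:.9e}")
print(f"std      = {std:.4f}")
print(f"cv       = {cv:.9f}")
```

The reported kurtosis of -1.2 is likewise the large-n excess kurtosis of a uniform distribution.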
| Value | Count | Frequency (%) |
| 0 | 1 | < 0.1% |
| 1992114 | 1 | < 0.1% |
| 1992116 | 1 | < 0.1% |
| 1992117 | 1 | < 0.1% |
| 1992118 | 1 | < 0.1% |
| 1992119 | 1 | < 0.1% |
| 1992120 | 1 | < 0.1% |
| 1992121 | 1 | < 0.1% |
| 1992122 | 1 | < 0.1% |
| 1992123 | 1 | < 0.1% |
| Other values (2988171) | 2988171 |
| Value | Count | Frequency (%) |
| 0 | 1 | < 0.1% |
| 1 | 1 | < 0.1% |
| 2 | 1 | < 0.1% |
| 3 | 1 | < 0.1% |
| 4 | 1 | < 0.1% |
| 5 | 1 | < 0.1% |
| 6 | 1 | < 0.1% |
| 7 | 1 | < 0.1% |
| 8 | 1 | < 0.1% |
| 9 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 2988180 | 1 | < 0.1% |
| 2988179 | 1 | < 0.1% |
| 2988178 | 1 | < 0.1% |
| 2988177 | 1 | < 0.1% |
| 2988176 | 1 | < 0.1% |
| 2988175 | 1 | < 0.1% |
| 2988174 | 1 | < 0.1% |
| 2988173 | 1 | < 0.1% |
| 2988172 | 1 | < 0.1% |
| 2988171 | 1 | < 0.1% |
| Distinct | 322897 |
|---|---|
| Distinct (%) | 10.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 107947.8258 |
| Minimum | 0 |
|---|---|
| Maximum | 322896 |
| Zeros | 8 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 22.8 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 6370 |
| Q1 | 40341 |
| median | 86229 |
| Q3 | 163261 |
| 95-th percentile | 274162 |
| Maximum | 322896 |
| Range | 322896 |
| Interquartile range (IQR) | 122920 |
Descriptive statistics
| Standard deviation | 83648.36147 |
|---|---|
| Coefficient of variation (CV) | 0.7748962136 |
| Kurtosis | -0.4686650537 |
| Mean | 107947.8258 |
| Median Absolute Deviation (MAD) | 57248 |
| Skewness | 0.7231189115 |
| Sum | 3.22567642 × 10¹¹ |
| Variance | 6997048377 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 5890 | 1232 | < 0.1% |
| 73574 | 939 | < 0.1% |
| 15867 | 900 | < 0.1% |
| 80350 | 783 | < 0.1% |
| 15275 | 746 | < 0.1% |
| 2151 | 722 | < 0.1% |
| 4568 | 529 | < 0.1% |
| 12897 | 513 | < 0.1% |
| 11521 | 502 | < 0.1% |
| 34541 | 501 | < 0.1% |
| Other values (322887) | 2980814 |
| Value | Count | Frequency (%) |
| 0 | 8 | < 0.1% |
| 1 | 12 | < 0.1% |
| 2 | 4 | < 0.1% |
| 3 | 17 | < 0.1% |
| 4 | 7 | < 0.1% |
| 5 | 87 | |
| 6 | 35 | |
| 7 | 22 | < 0.1% |
| 8 | 56 | |
| 9 | 4 | < 0.1% |
| Value | Count | Frequency (%) |
| 322896 | 2 | |
| 322895 | 2 | |
| 322894 | 2 | |
| 322893 | 2 | |
| 322892 | 2 | |
| 322891 | 2 | |
| 322890 | 2 | |
| 322889 | 2 | |
| 322888 | 2 | |
| 322887 | 3 |
| Distinct | 1048594 |
|---|---|
| Distinct (%) | 35.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.507472228 × 10¹⁵ |
| Minimum | 1.506825423 × 10¹⁵ |
|---|---|
| Maximum | 1.508211379 × 10¹⁵ |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 22.8 MiB |
Quantile statistics
| Minimum | 1.506825423 × 10¹⁵ |
|---|---|
| 5-th percentile | 1.506941766 × 10¹⁵ |
| Q1 | 1.507124152 × 10¹⁵ |
| median | 1.50749334 × 10¹⁵ |
| Q3 | 1.507749414 × 10¹⁵ |
| 95-th percentile | 1.508153221 × 10¹⁵ |
| Maximum | 1.508211379 × 10¹⁵ |
| Range | 1.385955918 × 10¹² |
| Interquartile range (IQR) | 6.252618534 × 10¹¹ |
Descriptive statistics
| Standard deviation | 3.855244602 × 10¹¹ |
|---|---|
| Coefficient of variation (CV) | 0.0002557423301 |
| Kurtosis | -1.111389169 |
| Mean | 1.507472228 × 10¹⁵ |
| Median Absolute Deviation (MAD) | 3.329949664 × 10¹¹ |
| Skewness | 0.1807598817 |
| Sum | 3.594316782 × 10¹⁸ |
| Variance | 1.486291094 × 10²³ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1.507563658 × 10¹⁵ | 124 | < 0.1% |
| 1.507896573 × 10¹⁵ | 107 | < 0.1% |
| 1.507133568 × 10¹⁵ | 106 | < 0.1% |
| 1.507309773 × 10¹⁵ | 98 | < 0.1% |
| 1.508112331 × 10¹⁵ | 94 | < 0.1% |
| 1.507647366 × 10¹⁵ | 92 | < 0.1% |
| 1.507475404 × 10¹⁵ | 86 | < 0.1% |
| 1.506959499 × 10¹⁵ | 82 | < 0.1% |
| 1.508154737 × 10¹⁵ | 79 | < 0.1% |
| 1.506999909 × 10¹⁵ | 75 | < 0.1% |
| Other values (1048584) | 2987238 |
| Value | Count | Frequency (%) |
| 1.506825423 × 10¹⁵ | 2 | |
| 1.506825426 × 10¹⁵ | 2 | |
| 1.506825435 × 10¹⁵ | 2 | |
| 1.506825443 × 10¹⁵ | 2 | |
| 1.506825528 × 10¹⁵ | 2 | |
| 1.506825541 × 10¹⁵ | 3 | |
| 1.506825553 × 10¹⁵ | 2 | |
| 1.506825568 × 10¹⁵ | 2 | |
| 1.506825573 × 10¹⁵ | 3 | |
| 1.506825599 × 10¹⁵ | 2 |
| Value | Count | Frequency (%) |
| 1.508211379 × 10¹⁵ | 2 | < 0.1% |
| 1.508211376 × 10¹⁵ | 2 | < 0.1% |
| 1.508211372 × 10¹⁵ | 2 | < 0.1% |
| 1.508211369 × 10¹⁵ | 7 | |
| 1.508211367 × 10¹⁵ | 2 | < 0.1% |
| 1.508211353 × 10¹⁵ | 4 | |
| 1.508211348 × 10¹⁵ | 2 | < 0.1% |
| 1.508211326 × 10¹⁵ | 2 | < 0.1% |
| 1.508211326 × 10¹⁵ | 4 | |
| 1.508211324 × 10¹⁵ | 2 | < 0.1% |
session_start
Categorical
| Distinct | 646874 |
|---|---|
| Distinct (%) | 21.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 22.8 MiB |
| 2017-10-09 15:40:57 | 127 |
|---|---|
| 2017-10-13 12:09:33 | 112 |
| 2017-10-04 16:12:47 | 108 |
| 2017-10-06 17:09:33 | 98 |
| 2017-10-10 14:56:06 | 97 |
| Other values (646869) |
Length
| Max length | 19 |
|---|---|
| Median length | 19 |
| Mean length | 19 |
| Min length | 19 |
Characters and Unicode
| Total characters | 56775439 |
|---|---|
| Distinct characters | 13 |
| Distinct categories | 4 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2017-10-01 02:37:03 |
|---|---|
| 2nd row | 2017-10-01 02:42:07 |
| 3rd row | 2017-10-01 02:48:59 |
| 4th row | 2017-10-01 02:49:02 |
| 5th row | 2017-10-01 02:54:23 |
Common Values
| Value | Count | Frequency (%) |
| 2017-10-09 15:40:57 | 127 | < 0.1% |
| 2017-10-13 12:09:33 | 112 | < 0.1% |
| 2017-10-04 16:12:47 | 108 | < 0.1% |
| 2017-10-06 17:09:33 | 98 | < 0.1% |
| 2017-10-10 14:56:06 | 97 | < 0.1% |
| 2017-10-16 00:05:31 | 96 | < 0.1% |
| 2017-10-10 16:05:43 | 87 | < 0.1% |
| 2017-10-02 15:51:39 | 87 | < 0.1% |
| 2017-10-08 15:10:03 | 86 | < 0.1% |
| 2017-10-16 11:52:17 | 85 | < 0.1% |
| Other values (646864) | 2987198 |
Words
| Value | Count | Frequency (%) |
| 2017-10-02 | 305709 | 5.1% |
| 2017-10-10 | 281384 | 4.7% |
| 2017-10-03 | 259709 | 4.3% |
| 2017-10-09 | 249856 | 4.2% |
| 2017-10-11 | 238521 | 4.0% |
| 2017-10-04 | 215267 | 3.6% |
| 2017-10-06 | 207537 | 3.5% |
| 2017-10-16 | 190891 | 3.2% |
| 2017-10-05 | 190074 | 3.2% |
| 2017-10-13 | 180599 | 3.0% |
| Other values (83818) | 3656815 |
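The counts in the table above add up to twice the number of observations, which is consistent with each timestamp being split on whitespace into a date token and a time token. The sketch below illustrates this on the five sample values listed earlier; the tokenization rule is an assumption inferred from the totals, not something the report states.

```python
from collections import Counter

n_rows = 2_988_181

# The five sample values shown for this variable.
sample = [
    "2017-10-01 02:37:03",
    "2017-10-01 02:42:07",
    "2017-10-01 02:48:59",
    "2017-10-01 02:49:02",
    "2017-10-01 02:54:23",
]

# Assumed rule: split each value on whitespace and count the tokens.
tokens = Counter(token for value in sample for token in value.split())

# Two tokens per row means percentages are relative to 2 * n_rows.
total_tokens = 2 * n_rows
top_share = 100 * 305_709 / total_tokens

# Counts listed in the table: the ten named rows plus "Other values".
listed = [305_709, 281_384, 259_709, 249_856, 238_521,
          215_267, 207_537, 190_891, 190_074, 180_599, 3_656_815]

print(tokens.most_common(1))
print(f"share of 2017-10-02: {top_share:.1f}%")
print(sum(listed) == total_tokens)
```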
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 11434946 | |
| 0 | 10649945 | |
| 2 | 5982278 | |
| - | 5976362 | |
| : | 5976362 | |
| 7 | 3949992 | 7.0% |
| (space) | 2988181 | 5.3% |
| 3 | 2391815 | 4.2% |
| 4 | 2104829 | 3.7% |
| 5 | 2074593 | 3.7% |
| Other values (3) | 3246136 | 5.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 41834534 | |
| Dash Punctuation | 5976362 | 10.5% |
| Other Punctuation | 5976362 | 10.5% |
| Space Separator | 2988181 | 5.3% |
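The category totals above can be reproduced from a single value, since every value has the same length of 19 characters. The sketch below classifies the characters of one sample timestamp with the standard `unicodedata` module and scales by the number of rows; it assumes every value follows the same `YYYY-MM-DD HH:MM:SS` layout.

```python
import unicodedata
from collections import Counter

n_rows = 2_988_181
value = "2017-10-01 02:37:03"  # first sample row

# Two-letter Unicode general categories:
# Nd = Decimal Number, Pd = Dash Punctuation,
# Po = Other Punctuation, Zs = Space Separator.
per_value = Counter(unicodedata.category(ch) for ch in value)

# Assumption: every row has the same layout, so totals scale linearly.
totals = {cat: count * n_rows for cat, count in per_value.items()}

for cat, count in sorted(totals.items()):
    print(cat, count)
print("total characters:", len(value) * n_rows)
```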
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 11434946 | |
| 0 | 10649945 | |
| 2 | 5982278 | |
| 7 | 3949992 | 9.4% |
| 3 | 2391815 | 5.7% |
| 4 | 2104829 | 5.0% |
| 5 | 2074593 | 5.0% |
| 6 | 1210846 | 2.9% |
| 9 | 1111614 | 2.7% |
| 8 | 923676 | 2.2% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 5976362 |
Other Punctuation
| Value | Count | Frequency (%) |
| : | 5976362 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 2988181 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 56775439 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 11434946 | |
| 0 | 10649945 | |
| 2 | 5982278 | |
| - | 5976362 | |
| : | 5976362 | |
| 7 | 3949992 | 7.0% |
| (space) | 2988181 | 5.3% |
| 3 | 2391815 | 4.2% |
| 4 | 2104829 | 3.7% |
| 5 | 2074593 | 3.7% |
| Other values (3) | 3246136 | 5.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 56775439 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 11434946 | |
| 0 | 10649945 | |
| 2 | 5982278 | |
| - | 5976362 | |
| : | 5976362 | |
| 7 | 3949992 | 7.0% |
| (space) | 2988181 | 5.3% |
| 3 | 2391815 | 4.2% |
| 4 | 2104829 | 3.7% |
| 5 | 2074593 | 3.7% |
| Other values (3) | 3246136 | 5.7% |
session_size
Real number (ℝ≥0)
| Distinct | 72 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.901885127 |
| Minimum | 2 |
|---|---|
| Maximum | 124 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 22.8 MiB |
Quantile statistics
| Minimum | 2 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 2 |
| median | 3 |
| Q3 | 4 |
| 95-th percentile | 9 |
| Maximum | 124 |
| Range | 122 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 3.929941495 |
|---|---|
| Coefficient of variation (CV) | 1.007190465 |
| Kurtosis | 158.4608899 |
| Mean | 3.901885127 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 9.090074854 |
| Sum | 11659539 |
| Variance | 15.44444016 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2 | 1260372 | |
| 3 | 670185 | |
| 4 | 374240 | 12.5% |
| 5 | 220105 | 7.4% |
| 6 | 135762 | 4.5% |
| 7 | 88354 | 3.0% |
| 8 | 58544 | 2.0% |
| 9 | 40878 | 1.4% |
| 10 | 29530 | 1.0% |
| 11 | 21714 | 0.7% |
| Other values (62) | 88497 | 3.0% |
| Value | Count | Frequency (%) |
| 2 | 1260372 | |
| 3 | 670185 | |
| 4 | 374240 | 12.5% |
| 5 | 220105 | 7.4% |
| 6 | 135762 | 4.5% |
| 7 | 88354 | 3.0% |
| 8 | 58544 | 2.0% |
| 9 | 40878 | 1.4% |
| 10 | 29530 | 1.0% |
| 11 | 21714 | 0.7% |
| Value | Count | Frequency (%) |
| 124 | 124 | |
| 107 | 107 | |
| 106 | 106 | |
| 98 | 98 | |
| 94 | 94 | |
| 92 | 92 | |
| 86 | 86 | |
| 82 | 82 | |
| 79 | 79 | |
| 75 | 75 |
| Distinct | 46033 |
|---|---|
| Distinct (%) | 1.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 194922.6487 |
| Minimum | 3 |
|---|---|
| Maximum | 364046 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 22.8 MiB |
Quantile statistics
| Minimum | 3 |
|---|---|
| 5-th percentile | 42223 |
| Q1 | 124228 |
| median | 202381 |
| Q3 | 277067 |
| 95-th percentile | 336254 |
| Maximum | 364046 |
| Range | 364043 |
| Interquartile range (IQR) | 152839 |
Descriptive statistics
| Standard deviation | 90768.42147 |
|---|---|
| Coefficient of variation (CV) | 0.4656638009 |
| Kurtosis | -0.943045904 |
| Mean | 194922.6487 |
| Median Absolute Deviation (MAD) | 77632 |
| Skewness | -0.1234365434 |
| Sum | 5.824641553 × 10¹¹ |
| Variance | 8238906336 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 160974 | 37213 | 1.2% |
| 272143 | 28943 | 1.0% |
| 336221 | 23851 | 0.8% |
| 234698 | 23499 | 0.8% |
| 123909 | 23122 | 0.8% |
| 336223 | 21855 | 0.7% |
| 96210 | 21577 | 0.7% |
| 162655 | 21062 | 0.7% |
| 183176 | 20303 | 0.7% |
| 168623 | 19526 | 0.7% |
| Other values (46023) | 2747230 |
| Value | Count | Frequency (%) |
| 3 | 1 | |
| 27 | 1 | |
| 69 | 1 | |
| 81 | 2 | |
| 84 | 1 | |
| 94 | 2 | |
| 115 | 2 | |
| 125 | 1 | |
| 137 | 1 | |
| 139 | 1 |
| Value | Count | Frequency (%) |
| 364046 | 2 | < 0.1% |
| 364043 | 8 | < 0.1% |
| 364028 | 1 | < 0.1% |
| 364022 | 1 | < 0.1% |
| 364017 | 22 | |
| 364015 | 1 | < 0.1% |
| 364014 | 1 | < 0.1% |
| 364013 | 1 | < 0.1% |
| 364012 | 1 | < 0.1% |
| 364001 | 4 | < 0.1% |
click_timestamp
Categorical
| Distinct | 2983198 |
|---|---|
| Distinct (%) | 99.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 22.8 MiB |
| 2017-10-03 17:40:48.643 | 3 |
|---|---|
| 2017-10-02 16:16:49.961 | 3 |
| 2017-10-02 14:54:37.261 | 3 |
| 2017-10-13 14:39:48.690 | 3 |
| 2017-10-06 20:07:23.928 | 3 |
| Other values (2983193) |
Length
| Max length | 23 |
|---|---|
| Median length | 23 |
| Mean length | 23 |
| Min length | 23 |
Characters and Unicode
| Total characters | 68728163 |
|---|---|
| Distinct characters | 14 |
| Distinct categories | 4 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 2978224 |
|---|---|
| Unique (%) | 99.7% |
Sample
| 1st row | 2017-10-01 03:00:28.020 |
|---|---|
| 2nd row | 2017-10-01 05:42:28.634 |
| 3rd row | 2017-10-01 11:27:58.141 |
| 4th row | 2017-10-01 03:08:29.970 |
| 5th row | 2017-10-01 03:33:43.469 |
Common Values
| Value | Count | Frequency (%) |
| 2017-10-03 17:40:48.643 | 3 | < 0.1% |
| 2017-10-02 16:16:49.961 | 3 | < 0.1% |
| 2017-10-02 14:54:37.261 | 3 | < 0.1% |
| 2017-10-13 14:39:48.690 | 3 | < 0.1% |
| 2017-10-06 20:07:23.928 | 3 | < 0.1% |
| 2017-10-09 13:01:34.045 | 3 | < 0.1% |
| 2017-10-02 20:16:02.256 | 3 | < 0.1% |
| 2017-10-16 14:42:54.899 | 3 | < 0.1% |
| 2017-10-14 12:28:25.656 | 3 | < 0.1% |
| 2017-10-10 01:08:21.540 | 2 | < 0.1% |
| Other values (2983188) | 2988152 |
Words
| Value | Count | Frequency (%) |
| 2017-10-02 | 303177 | 5.1% |
| 2017-10-10 | 282391 | 4.7% |
| 2017-10-03 | 261159 | 4.4% |
| 2017-10-09 | 248208 | 4.2% |
| 2017-10-11 | 238969 | 4.0% |
| 2017-10-04 | 215415 | 3.6% |
| 2017-10-06 | 207646 | 3.5% |
| 2017-10-05 | 190003 | 3.2% |
| 2017-10-16 | 189779 | 3.2% |
| 2017-10-13 | 180723 | 3.0% |
| Other values (2923727) | 3658892 |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 12265879 | |
| 0 | 11533489 | |
| 2 | 6914494 | |
| - | 5976362 | |
| : | 5976362 | |
| 7 | 4849332 | 7.1% |
| 3 | 3299611 | 4.8% |
| 4 | 3017099 | 4.4% |
| (space) | 2988181 | 4.3% |
| . | 2988181 | 4.3% |
| Other values (4) | 8919173 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 50799077 | |
| Other Punctuation | 8964543 | 13.0% |
| Dash Punctuation | 5976362 | 8.7% |
| Space Separator | 2988181 | 4.3% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 12265879 | |
| 0 | 11533489 | |
| 2 | 6914494 | |
| 7 | 4849332 | 9.5% |
| 3 | 3299611 | 6.5% |
| 4 | 3017099 | 5.9% |
| 5 | 2978574 | 5.9% |
| 6 | 2106971 | 4.1% |
| 9 | 2008543 | 4.0% |
| 8 | 1825085 | 3.6% |
Other Punctuation
| Value | Count | Frequency (%) |
| : | 5976362 | |
| . | 2988181 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 5976362 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 2988181 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 68728163 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 12265879 | |
| 0 | 11533489 | |
| 2 | 6914494 | |
| - | 5976362 | |
| : | 5976362 | |
| 7 | 4849332 | 7.1% |
| 3 | 3299611 | 4.8% |
| 4 | 3017099 | 4.4% |
| (space) | 2988181 | 4.3% |
| . | 2988181 | 4.3% |
| Other values (4) | 8919173 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 68728163 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 12265879 | |
| 0 | 11533489 | |
| 2 | 6914494 | |
| - | 5976362 | |
| : | 5976362 | |
| 7 | 4849332 | 7.1% |
| 3 | 3299611 | 4.8% |
| 4 | 3017099 | 4.4% |
| (space) | 2988181 | 4.3% |
| . | 2988181 | 4.3% |
| Other values (4) | 8919173 |
click_environment
Real number (ℝ≥0)
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.942652068 |
| Minimum | 1 |
|---|---|
| Maximum | 4 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 22.8 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 4 |
| Q1 | 4 |
| median | 4 |
| Q3 | 4 |
| 95-th percentile | 4 |
| Maximum | 4 |
| Range | 3 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.339680408 |
|---|---|
| Coefficient of variation (CV) | 0.0861553092 |
| Kurtosis | 33.01323632 |
| Mean | 3.942652068 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | -5.848728196 |
| Sum | 11781358 |
| Variance | 0.1153827796 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 4 | 2904478 | 97.2% |
| 2 | 79743 | 2.7% |
| 1 | 3960 | 0.1% |
| Value | Count | Frequency (%) |
| 1 | 3960 | 0.1% |
| 2 | 79743 | 2.7% |
| 4 | 2904478 | 97.2% |
| Value | Count | Frequency (%) |
| 4 | 2904478 | 97.2% |
| 2 | 79743 | 2.7% |
| 1 | 3960 | 0.1% |
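With only three distinct values, the summary statistics for click_environment can be recomputed directly from the value counts above. The sketch below derives the sum, mean, variance, and percentage shares from the frequency table; it assumes the sample variance (ddof = 1), which appears to match the reported figure.

```python
# Value counts copied from the click_environment table above.
counts = {4: 2_904_478, 2: 79_743, 1: 3_960}

n = sum(counts.values())
total = sum(value * count for value, count in counts.items())
mean = total / n

# Sample variance (ddof=1), an assumed convention for this check.
sum_sq = sum(value * value * count for value, count in counts.items())
variance = (sum_sq - n * mean ** 2) / (n - 1)

# Percentage share of each value.
share = {value: 100 * count / n for value, count in counts.items()}

print(f"n        = {n}")
print(f"sum      = {total}")
print(f"mean     = {mean:.9f}")
print(f"variance = {variance:.10f}")
for value, pct in share.items():
    print(f"value {value}: {pct:.1f}%")
```

The share of value 4 comes to about 97.2%, the figure missing from the frequency column where the bar label was lost.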
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.819305792 |
| Minimum | 1 |
|---|---|
| Maximum | 5 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 22.8 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1 |
| Q3 | 3 |
| 95-th percentile | 3 |
| Maximum | 5 |
| Range | 4 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.042213782 |
|---|---|
| Coefficient of variation (CV) | 0.5728634442 |
| Kurtosis | -1.427040365 |
| Mean | 1.819305792 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 0.5763858618 |
| Sum | 5436415 |
| Variance | 1.086209567 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 1823162 | |
| 3 | 1047086 | |
| 4 | 117640 | 3.9% |
| 5 | 283 | < 0.1% |
| 2 | 10 | < 0.1% |
| Value | Count | Frequency (%) |
| 1 | 1823162 | |
| 2 | 10 | < 0.1% |
| 3 | 1047086 | |
| 4 | 117640 | 3.9% |
| 5 | 283 | < 0.1% |
| Value | Count | Frequency (%) |
| 5 | 283 | < 0.1% |
| 4 | 117640 | 3.9% |
| 3 | 1047086 | |
| 2 | 10 | < 0.1% |
| 1 | 1823162 |
| Distinct | 8 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 13.27760333 |
| Minimum | 2 |
|---|---|
| Maximum | 20 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 22.8 MiB |
Quantile statistics
| Minimum | 2 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 2 |
| median | 17 |
| Q3 | 17 |
| 95-th percentile | 20 |
| Maximum | 20 |
| Range | 18 |
| Interquartile range (IQR) | 15 |
Descriptive statistics
| Standard deviation | 6.881718417 |
|---|---|
| Coefficient of variation (CV) | 0.5182952258 |
| Kurtosis | -0.9317514661 |
| Mean | 13.27760333 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | -0.9541171292 |
| Sum | 39675882 |
| Variance | 47.35804837 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 17 | 1738138 | |
| 2 | 788699 | |
| 20 | 369586 | 12.4% |
| 12 | 60096 | 2.0% |
| 13 | 23711 | 0.8% |
| 19 | 6384 | 0.2% |
| 5 | 1513 | 0.1% |
| 3 | 54 | < 0.1% |
| Value | Count | Frequency (%) |
| 2 | 788699 | |
| 3 | 54 | < 0.1% |
| 5 | 1513 | 0.1% |
| 12 | 60096 | 2.0% |
| 13 | 23711 | 0.8% |
| 17 | 1738138 | |
| 19 | 6384 | 0.2% |
| 20 | 369586 | 12.4% |
| Value | Count | Frequency (%) |
| 20 | 369586 | 12.4% |
| 19 | 6384 | 0.2% |
| 17 | 1738138 | |
| 13 | 23711 | 0.8% |
| 12 | 60096 | 2.0% |
| 5 | 1513 | 0.1% |
| 3 | 54 | < 0.1% |
| 2 | 788699 |
| Distinct | 11 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.357656046 |
| Minimum | 1 |
|---|---|
| Maximum | 11 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 22.8 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1 |
| Q3 | 1 |
| 95-th percentile | 1 |
| Maximum | 11 |
| Range | 10 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 1.725860976 |
|---|---|
| Coefficient of variation (CV) | 1.271206342 |
| Kurtosis | 21.55275991 |
| Mean | 1.357656046 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 4.802252338 |
| Sum | 4056922 |
| Variance | 2.978596109 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 2852406 | |
| 10 | 61377 | 2.1% |
| 11 | 29999 | 1.0% |
| 8 | 9556 | 0.3% |
| 6 | 7256 | 0.2% |
| 9 | 6746 | 0.2% |
| 2 | 6101 | 0.2% |
| 3 | 4540 | 0.2% |
| 5 | 3498 | 0.1% |
| 4 | 3389 | 0.1% |
| Value | Count | Frequency (%) |
| 1 | 2852406 | |
| 2 | 6101 | 0.2% |
| 3 | 4540 | 0.2% |
| 4 | 3389 | 0.1% |
| 5 | 3498 | 0.1% |
| 6 | 7256 | 0.2% |
| 7 | 3313 | 0.1% |
| 8 | 9556 | 0.3% |
| 9 | 6746 | 0.2% |
| 10 | 61377 | 2.1% |
| Value | Count | Frequency (%) |
| 11 | 29999 | |
| 10 | 61377 | |
| 9 | 6746 | 0.2% |
| 8 | 9556 | 0.3% |
| 7 | 3313 | 0.1% |
| 6 | 7256 | 0.2% |
| 5 | 3498 | 0.1% |
| 4 | 3389 | 0.1% |
| 3 | 4540 | 0.2% |
| 2 | 6101 | 0.2% |
| Distinct | 28 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 18.31331435 |
| Minimum | 1 |
|---|---|
| Maximum | 28 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 22.8 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 5 |
| Q1 | 13 |
| median | 21 |
| Q3 | 25 |
| 95-th percentile | 27 |
| Maximum | 28 |
| Range | 27 |
| Interquartile range (IQR) | 12 |
Descriptive statistics
| Standard deviation | 7.064006436 |
|---|---|
| Coefficient of variation (CV) | 0.3857306383 |
| Kurtosis | -0.9755078164 |
| Mean | 18.31331435 |
| Median Absolute Deviation (MAD) | 4 |
| Skewness | -0.545880017 |
| Sum | 54723498 |
| Variance | 49.90018693 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 25 | 804985 | |
| 21 | 464230 | |
| 13 | 320957 | 10.7% |
| 8 | 179339 | 6.0% |
| 16 | 164884 | 5.5% |
| 28 | 135793 | 4.5% |
| 24 | 130537 | 4.4% |
| 20 | 120884 | 4.0% |
| 5 | 96979 | 3.2% |
| 9 | 84693 | 2.8% |
| Other values (18) | 484900 |
| Value | Count | Frequency (%) |
| 1 | 7110 | 0.2% |
| 2 | 16728 | 0.6% |
| 3 | 3997 | 0.1% |
| 4 | 30265 | 1.0% |
| 5 | 96979 | |
| 6 | 57254 | 1.9% |
| 7 | 64062 | 2.1% |
| 8 | 179339 | |
| 9 | 84693 | |
| 10 | 21995 | 0.7% |
| Value | Count | Frequency (%) |
| 28 | 135793 | 4.5% |
| 27 | 18711 | 0.6% |
| 26 | 18893 | 0.6% |
| 25 | 804985 | |
| 24 | 130537 | 4.4% |
| 23 | 43 | < 0.1% |
| 22 | 13101 | 0.4% |
| 21 | 464230 | |
| 20 | 120884 | 4.0% |
| 19 | 34092 | 1.1% |
click_referrer_type
Real number (ℝ≥0)
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.838981307 |
| Minimum | 1 |
|---|---|
| Maximum | 7 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 22.8 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 2 |
| Q3 | 2 |
| 95-th percentile | 5 |
| Maximum | 7 |
| Range | 6 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 1.15635571 |
|---|---|
| Coefficient of variation (CV) | 0.628802319 |
| Kurtosis | 9.117533472 |
| Mean | 1.838981307 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 2.83996653 |
| Sum | 5495209 |
| Variance | 1.337158529 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2 | 1602601 | |
| 1 | 1194321 | |
| 5 | 80766 | 2.7% |
| 7 | 69798 | 2.3% |
| 6 | 20455 | 0.7% |
| 4 | 19820 | 0.7% |
| 3 | 420 | < 0.1% |
| Value | Count | Frequency (%) |
| 1 | 1194321 | |
| 2 | 1602601 | |
| 3 | 420 | < 0.1% |
| 4 | 19820 | 0.7% |
| 5 | 80766 | 2.7% |
| 6 | 20455 | 0.7% |
| 7 | 69798 | 2.3% |
| Value | Count | Frequency (%) |
| 7 | 69798 | 2.3% |
| 6 | 20455 | 0.7% |
| 5 | 80766 | 2.7% |
| 4 | 19820 | 0.7% |
| 3 | 420 | < 0.1% |
| 2 | 1602601 | |
| 1 | 1194321 |
| Distinct | 316 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 305.9382404 |
| Minimum | 1 |
|---|---|
| Maximum | 460 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 22.8 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 67 |
| Q1 | 250 |
| median | 327 |
| Q3 | 409 |
| 95-th percentile | 437 |
| Maximum | 460 |
| Range | 459 |
| Interquartile range (IQR) | 159 |
Descriptive statistics
| Standard deviation | 113.0805459 |
|---|---|
| Coefficient of variation (CV) | 0.3696188674 |
| Kurtosis | 0.1095423476 |
| Mean | 305.9382404 |
| Median Absolute Deviation (MAD) | 77 |
| Skewness | -0.8877669712 |
| Sum | 914198837 |
| Variance | 12787.20986 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 281 | 370843 | 12.4% |
| 375 | 268257 | 9.0% |
| 412 | 178894 | 6.0% |
| 437 | 157085 | 5.3% |
| 250 | 140454 | 4.7% |
| 331 | 115901 | 3.9% |
| 399 | 104464 | 3.5% |
| 209 | 83750 | 2.8% |
| 418 | 67119 | 2.2% |
| 118 | 64216 | 2.1% |
| Other values (306) | 1437198 |
| Value | Count | Frequency (%) |
| 1 | 6107 | 0.2% |
| 2 | 3742 | 0.1% |
| 3 | 1 | < 0.1% |
| 4 | 2856 | 0.1% |
| 6 | 9971 | |
| 7 | 19898 | |
| 9 | 15470 | |
| 11 | 2 | < 0.1% |
| 15 | 48 | < 0.1% |
| 16 | 135 | < 0.1% |
| Value | Count | Frequency (%) |
| 460 | 12 | < 0.1% |
| 458 | 1230 | < 0.1% |
| 456 | 11 | < 0.1% |
| 455 | 11042 | |
| 454 | 102 | < 0.1% |
| 453 | 5 | < 0.1% |
| 451 | 2 | < 0.1% |
| 450 | 4830 | |
| 449 | 3 | < 0.1% |
| 448 | 4436 |
created_at_ts
Categorical
| Distinct | 45785 |
|---|---|
| Distinct (%) | 1.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 22.8 MiB |
| 2017-10-02 02:52:27 | 37213 |
|---|---|
| 2017-10-02 16:31:10 | 28943 |
| 2017-10-10 05:26:01 | 23851 |
| 2017-10-10 06:56:37 | 23499 |
| 2017-10-05 10:22:35 | 23122 |
| Other values (45780) |
Length
| Max length | 19 |
|---|---|
| Median length | 19 |
| Mean length | 19 |
| Min length | 19 |
Characters and Unicode
| Total characters | 56775439 |
|---|---|
| Distinct characters | 13 |
| Distinct categories | 4 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 24666 |
|---|---|
| Unique (%) | 0.8% |
Sample
| 1st row | 2017-09-30 19:41:58 |
|---|---|
| 2nd row | 2017-09-30 19:41:58 |
| 3rd row | 2017-09-30 19:41:58 |
| 4th row | 2017-09-30 19:41:58 |
| 5th row | 2017-09-30 19:41:58 |
Common Values
| Value | Count | Frequency (%) |
| 2017-10-02 02:52:27 | 37213 | 1.2% |
| 2017-10-02 16:31:10 | 28943 | 1.0% |
| 2017-10-10 05:26:01 | 23851 | 0.8% |
| 2017-10-10 06:56:37 | 23499 | 0.8% |
| 2017-10-05 10:22:35 | 23122 | 0.8% |
| 2017-10-09 13:28:20 | 21855 | 0.7% |
| 2017-10-12 08:59:51 | 21577 | 0.7% |
| 2017-10-02 13:06:50 | 21062 | 0.7% |
| 2017-10-11 14:18:49 | 20303 | 0.7% |
| 2017-10-04 19:07:30 | 19526 | 0.7% |
| Other values (45775) | 2747230 |
Words
| Value | Count | Frequency (%) |
| 2017-10-02 | 354788 | 5.9% |
| 2017-10-10 | 270884 | 4.5% |
| 2017-10-09 | 261821 | 4.4% |
| 2017-10-05 | 218982 | 3.7% |
| 2017-10-11 | 218099 | 3.6% |
| 2017-10-04 | 211749 | 3.5% |
| 2017-10-06 | 197795 | 3.3% |
| 2017-10-03 | 179220 | 3.0% |
| 2017-10-13 | 148735 | 2.5% |
| 2017-10-16 | 143295 | 2.4% |
| Other values (32977) | 3770994 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 11307282 | |
| 1 | 11162604 | |
| - | 5976362 | |
| : | 5976362 | |
| 2 | 5650894 | |
| 7 | 4046248 | 7.1% |
| (space) | 2988181 | 5.3% |
| 5 | 2245202 | 4.0% |
| 3 | 1978325 | 3.5% |
| 4 | 1844803 | 3.2% |
| Other values (3) | 3599176 | 6.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 41834534 | |
| Dash Punctuation | 5976362 | 10.5% |
| Other Punctuation | 5976362 | 10.5% |
| Space Separator | 2988181 | 5.3% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 11307282 | |
| 1 | 11162604 | |
| 2 | 5650894 | |
| 7 | 4046248 | 9.7% |
| 5 | 2245202 | 5.4% |
| 3 | 1978325 | 4.7% |
| 4 | 1844803 | 4.4% |
| 6 | 1315854 | 3.1% |
| 9 | 1201572 | 2.9% |
| 8 | 1081750 | 2.6% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 5976362 |
Other Punctuation
| Value | Count | Frequency (%) |
| : | 5976362 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 2988181 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 56775439 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 11307282 | |
| 1 | 11162604 | |
| - | 5976362 | |
| : | 5976362 | |
| 2 | 5650894 | |
| 7 | 4046248 | 7.1% |
| (space) | 2988181 | 5.3% |
| 5 | 2245202 | 4.0% |
| 3 | 1978325 | 3.5% |
| 4 | 1844803 | 3.2% |
| Other values (3) | 3599176 | 6.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 56775439 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 11307282 | |
| 1 | 11162604 | |
| - | 5976362 | |
| : | 5976362 | |
| 2 | 5650894 | |
| 7 | 4046248 | 7.1% |
| (space) | 2988181 | 5.3% |
| 5 | 2245202 | 4.0% |
| 3 | 1978325 | 3.5% |
| 4 | 1844803 | 3.2% |
| Other values (3) | 3599176 | 6.3% |
publisher_id
Real number (ℝ≥0)
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0 |
| Minimum | 0 |
|---|---|
| Maximum | 0 |
| Zeros | 2988181 |
| Zeros (%) | 100.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 22.8 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0 |
| Maximum | 0 |
| Range | 0 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0 |
|---|---|
| Coefficient of variation (CV) | nan |
| Kurtosis | 0 |
| Mean | 0 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 0 |
| Sum | 0 |
| Variance | 0 |
| Monotonicity | Increasing |
| Value | Count | Frequency (%) |
| 0 | 2988181 |
| Value | Count | Frequency (%) |
| 0 | 2988181 |
| Value | Count | Frequency (%) |
| 0 | 2988181 |
words_count
Real number (ℝ≥0)
| Distinct | 536 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 208.6283381 |
| Minimum | 0 |
|---|---|
| Maximum | 6690 |
| Zeros | 65 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 22.8 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 136 |
| Q1 | 173 |
| median | 198 |
| Q3 | 232 |
| 95-th percentile | 284 |
| Maximum | 6690 |
| Range | 6690 |
| Interquartile range (IQR) | 59 |
Descriptive statistics
| Standard deviation | 81.60152023 |
|---|---|
| Coefficient of variation (CV) | 0.3911334432 |
| Kurtosis | 79.67854049 |
| Mean | 208.6283381 |
| Median Absolute Deviation (MAD) | 28 |
| Skewness | 6.55037723 |
| Sum | 623419236 |
| Variance | 6658.808105 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 184 | 62994 | 2.1% |
| 158 | 60457 | 2.0% |
| 197 | 58150 | 1.9% |
| 210 | 56408 | 1.9% |
| 259 | 49572 | 1.7% |
| 183 | 44730 | 1.5% |
| 199 | 42994 | 1.4% |
| 205 | 42918 | 1.4% |
| 198 | 40228 | 1.3% |
| 220 | 39454 | 1.3% |
| Other values (526) | 2490276 |
| Value | Count | Frequency (%) |
| 0 | 65 | < 0.1% |
| 5 | 1 | < 0.1% |
| 7 | 11 | < 0.1% |
| 8 | 137 | < 0.1% |
| 10 | 559 | |
| 11 | 9 | < 0.1% |
| 12 | 19 | < 0.1% |
| 13 | 3 | < 0.1% |
| 14 | 1279 | |
| 15 | 5 | < 0.1% |
| Value | Count | Frequency (%) |
| 6690 | 1 | < 0.1% |
| 3808 | 7 | |
| 3082 | 5 | |
| 2899 | 1 | < 0.1% |
| 2743 | 4 | |
| 1764 | 8 | |
| 1676 | 2 | < 0.1% |
| 1635 | 3 | < 0.1% |
| 1626 | 2 | < 0.1% |
| 1606 | 1 | < 0.1% |
delta_timestamp
Categorical
| Distinct | 845906 |
|---|---|
| Distinct (%) | 28.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 22.8 MiB |
| Value | Count |
|---|---|
| 0 days 00:00:00 | 1048606 |
| 0 days 00:00:30 | 457210 |
| -1 days +23:59:30 | 406846 |
| 0 days 00:00:00.001000 | 11 |
| 0 days 00:00:00.052000 | 11 |
| Other values (845901) |
Length
| Max length | 25 |
|---|---|
| Median length | 15 |
| Mean length | 18.12865084 |
| Min length | 15 |
Characters and Unicode
| Total characters | 54171690 |
|---|---|
| Distinct characters | 19 |
| Distinct categories | 6 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 674100 |
|---|---|
| Unique (%) | 22.6% |
Sample
| 1st row | 0 days 00:00:00 |
|---|---|
| 2nd row | 0 days 00:00:00 |
| 3rd row | 0 days 00:00:00 |
| 4th row | 0 days 00:00:00 |
| 5th row | 0 days 00:00:00 |
Common Values
| Value | Count | Frequency (%) |
| 0 days 00:00:00 | 1048606 | |
| 0 days 00:00:30 | 457210 | |
| -1 days +23:59:30 | 406846 | 13.6% |
| 0 days 00:00:00.001000 | 11 | < 0.1% |
| 0 days 00:00:00.052000 | 11 | < 0.1% |
| 0 days 00:00:00.061000 | 10 | < 0.1% |
| 0 days 00:00:00.167000 | 10 | < 0.1% |
| 0 days 00:00:00.023000 | 10 | < 0.1% |
| -1 days +23:59:59.965000 | 10 | < 0.1% |
| 0 days 00:00:00.002000 | 10 | < 0.1% |
| Other values (845896) | 1075447 |
Length
| Value | Count | Frequency (%) |
| days | 2988181 | |
| 0 | 2072584 | |
| 00:00:00 | 1048606 | 11.7% |
| 1 | 913777 | 10.2% |
| 00:00:30 | 457210 | 5.1% |
| 23:59:30 | 406846 | 4.5% |
| 2 | 1033 | < 0.1% |
| 3 | 371 | < 0.1% |
| 4 | 171 | < 0.1% |
| 5 | 118 | < 0.1% |
| Other values (845766) | 1075646 | 12.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 16508571 | |
| (space) | 5976362 | 11.0% |
| : | 5976362 | 11.0% |
| d | 2988181 | 5.5% |
| a | 2988181 | 5.5% |
| y | 2988181 | 5.5% |
| s | 2988181 | 5.5% |
| 3 | 2512041 | 4.6% |
| 1 | 1775907 | 3.3% |
| 2 | 1717247 | 3.2% |
| Other values (9) | 7752476 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 27363959 | |
| Lowercase Letter | 11952724 | |
| Other Punctuation | 7050807 | 13.0% |
| Space Separator | 5976362 | 11.0% |
| Dash Punctuation | 913919 | 1.7% |
| Math Symbol | 913919 | 1.7% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 16508571 | |
| 3 | 2512041 | 9.2% |
| 1 | 1775907 | 6.5% |
| 2 | 1717247 | 6.3% |
| 5 | 1481205 | 5.4% |
| 9 | 957894 | 3.5% |
| 4 | 792265 | 2.9% |
| 8 | 550630 | 2.0% |
| 7 | 539231 | 2.0% |
| 6 | 528968 | 1.9% |
Lowercase Letter
| Value | Count | Frequency (%) |
| d | 2988181 | |
| a | 2988181 | |
| y | 2988181 | |
| s | 2988181 |
Other Punctuation
| Value | Count | Frequency (%) |
| : | 5976362 | |
| . | 1074445 | 15.2% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 5976362 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 913919 |
Math Symbol
| Value | Count | Frequency (%) |
| + | 913919 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 42218966 | |
| Latin | 11952724 | 22.1% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 16508571 | |
| (space) | 5976362 | 14.2% |
| : | 5976362 | 14.2% |
| 3 | 2512041 | 6.0% |
| 1 | 1775907 | 4.2% |
| 2 | 1717247 | 4.1% |
| 5 | 1481205 | 3.5% |
| . | 1074445 | 2.5% |
| 9 | 957894 | 2.3% |
| - | 913919 | 2.2% |
| Other values (5) | 3325013 | 7.9% |
Latin
| Value | Count | Frequency (%) |
| d | 2988181 | |
| a | 2988181 | |
| y | 2988181 | |
| s | 2988181 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 54171690 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 16508571 | |
| (space) | 5976362 | 11.0% |
| : | 5976362 | 11.0% |
| d | 2988181 | 5.5% |
| a | 2988181 | 5.5% |
| y | 2988181 | 5.5% |
| s | 2988181 | 5.5% |
| 3 | 2512041 | 4.6% |
| 1 | 1775907 | 3.3% |
| 2 | 1717247 | 3.2% |
| Other values (9) | 7752476 |
session_end
Categorical
| Distinct | 1409401 |
|---|---|
| Distinct (%) | 47.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 22.8 MiB |
| Value | Count |
|---|---|
| 0 days 00:00:00.826000 | 62 |
| 0 days 00:00:00.359000 | 62 |
| 0 days 00:00:00.837000 | 62 |
| 0 days 00:00:00.973000 | 61 |
| 0 days 00:00:00.767000 | 60 |
| Other values (1409396) |
Length
| Max length | 23 |
|---|---|
| Median length | 22 |
| Mean length | 21.99319218 |
| Min length | 15 |
Characters and Unicode
| Total characters | 65719639 |
|---|---|
| Distinct characters | 17 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 914347 |
|---|---|
| Unique (%) | 30.6% |
Sample
| 1st row | 0 days 00:23:25.020000 |
|---|---|
| 2nd row | 0 days 03:00:21.634000 |
| 3rd row | 0 days 08:38:59.141000 |
| 4th row | 0 days 00:19:27.970000 |
| 5th row | 0 days 00:39:20.469000 |
Common Values
| Value | Count | Frequency (%) |
| 0 days 00:00:00.826000 | 62 | < 0.1% |
| 0 days 00:00:00.359000 | 62 | < 0.1% |
| 0 days 00:00:00.837000 | 62 | < 0.1% |
| 0 days 00:00:00.973000 | 61 | < 0.1% |
| 0 days 00:00:00.767000 | 60 | < 0.1% |
| 0 days 00:00:00.204000 | 58 | < 0.1% |
| 0 days 00:00:00.154000 | 57 | < 0.1% |
| 0 days 00:00:00.510000 | 57 | < 0.1% |
| 0 days 00:00:30.977000 | 57 | < 0.1% |
| 0 days 00:00:00.863000 | 57 | < 0.1% |
| Other values (1409391) | 2987588 |
Length
| Value | Count | Frequency (%) |
| days | 2988181 | |
| 0 | 2980850 | |
| 1 | 3840 | < 0.1% |
| 2 | 1455 | < 0.1% |
| 3 | 759 | < 0.1% |
| 4 | 419 | < 0.1% |
| 5 | 227 | < 0.1% |
| 6 | 197 | < 0.1% |
| 7 | 130 | < 0.1% |
| 00:00:00.826000 | 62 | < 0.1% |
| Other values (1408991) | 2988423 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 22149451 | |
| (space) | 5976362 | 9.1% |
| : | 5976362 | 9.1% |
| d | 2988181 | 4.5% |
| a | 2988181 | 4.5% |
| y | 2988181 | 4.5% |
| s | 2988181 | 4.5% |
| . | 2985246 | 4.5% |
| 1 | 2773841 | 4.2% |
| 2 | 2321243 | 3.5% |
| Other values (7) | 11584410 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 38828945 | |
| Lowercase Letter | 11952724 | 18.2% |
| Other Punctuation | 8961608 | 13.6% |
| Space Separator | 5976362 | 9.1% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 22149451 | |
| 1 | 2773841 | 7.1% |
| 2 | 2321243 | 6.0% |
| 3 | 2167542 | 5.6% |
| 4 | 2034920 | 5.2% |
| 5 | 1957277 | 5.0% |
| 6 | 1383712 | 3.6% |
| 7 | 1361805 | 3.5% |
| 8 | 1345390 | 3.5% |
| 9 | 1333764 | 3.4% |
Lowercase Letter
| Value | Count | Frequency (%) |
| d | 2988181 | |
| a | 2988181 | |
| y | 2988181 | |
| s | 2988181 |
Other Punctuation
| Value | Count | Frequency (%) |
| : | 5976362 | |
| . | 2985246 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 5976362 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 53766915 | |
| Latin | 11952724 | 18.2% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 22149451 | |
| (space) | 5976362 | 11.1% |
| : | 5976362 | 11.1% |
| . | 2985246 | 5.6% |
| 1 | 2773841 | 5.2% |
| 2 | 2321243 | 4.3% |
| 3 | 2167542 | 4.0% |
| 4 | 2034920 | 3.8% |
| 5 | 1957277 | 3.6% |
| 6 | 1383712 | 2.6% |
| Other values (3) | 4040959 | 7.5% |
Latin
| Value | Count | Frequency (%) |
| d | 2988181 | |
| a | 2988181 | |
| y | 2988181 | |
| s | 2988181 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 65719639 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 22149451 | |
| (space) | 5976362 | 9.1% |
| : | 5976362 | 9.1% |
| d | 2988181 | 4.5% |
| a | 2988181 | 4.5% |
| y | 2988181 | 4.5% |
| s | 2988181 | 4.5% |
| . | 2985246 | 4.5% |
| 1 | 2773841 | 4.2% |
| 2 | 2321243 | 3.5% |
| Other values (7) | 11584410 |
session_time
Real number (ℝ)
| Distinct | 52665 |
|---|---|
| Distinct (%) | 1.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 496.4351383 |
| Minimum | -1607899 |
|---|---|
| Maximum | 1804645 |
| Zeros | 9270 |
| Zeros (%) | 0.3% |
| Negative | 912713 |
| Negative (%) | 30.5% |
| Memory size | 22.8 MiB |
Quantile statistics
| Minimum | -1607899 |
|---|---|
| 5-th percentile | -677 |
| Q1 | -30 |
| median | 30 |
| Q3 | 266 |
| 95-th percentile | 2520 |
| Maximum | 1804645 |
| Range | 3412544 |
| Interquartile range (IQR) | 296 |
Descriptive statistics
| Standard deviation | 9565.460867 |
|---|---|
| Coefficient of variation (CV) | 19.26829938 |
| Kurtosis | 4064.534365 |
| Mean | 496.4351383 |
| Median Absolute Deviation (MAD) | 89 |
| Skewness | 19.23172964 |
| Sum | 1483438048 |
| Variance | 91498041.6 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 30 | 468396 | 15.7% |
| -30 | 408265 | 13.7% |
| 31 | 11980 | 0.4% |
| 1 | 11952 | 0.4% |
| 0 | 9270 | 0.3% |
| 43 | 6277 | 0.2% |
| 42 | 6259 | 0.2% |
| 46 | 6254 | 0.2% |
| 45 | 6209 | 0.2% |
| 44 | 6200 | 0.2% |
| Other values (52655) | 2047119 |
| Value | Count | Frequency (%) |
| -1607899 | 1 | |
| -1301610 | 1 | |
| -1289919 | 1 | |
| -1226026 | 1 | |
| -1219328 | 1 | |
| -1208454 | 1 | |
| -883373 | 1 | |
| -879690 | 1 | |
| -874539 | 1 | |
| -783333 | 1 |
| Value | Count | Frequency (%) |
| 1804645 | 1 | |
| 1659606 | 1 | |
| 1368301 | 1 | |
| 1320957 | 1 | |
| 1280823 | 1 | |
| 1220330 | 1 | |
| 1218263 | 1 | |
| 1212149 | 1 | |
| 1138682 | 1 | |
| 1121327 | 1 |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
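The calculation described above can be sketched in plain Python. This is an illustrative sketch, not the code the report generator runs: the helper names `average_ranks`, `pearson`, and `spearman` are hypothetical, ties are assumed to receive the mean of their rank positions, and the toy data is invented for the example.

```python
from statistics import mean

def average_ranks(values):
    """Rank values 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Covariance divided by the product of the standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy data: y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman(x, y)   # monotonic relationship, so rho is 1 up to rounding
r = pearson(x, y)      # strictly below 1 because the relationship is not linear
```

The sketch does not guard against a zero standard deviation, so a constant column such as publisher_id would raise a division error here rather than yield a coefficient.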
Pearson's r
Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
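As a small illustration of the formula and of the invariance property, the following plain-Python sketch computes r on invented toy data and then again after shifting and rescaling each variable separately. The function name `pearson` and the data are assumptions made for this example, and the scale factors are chosen positive because a negative factor would flip the sign of r.

```python
from statistics import mean

def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    # The 1/n factors of the covariance and the standard deviations cancel,
    # so plain sums are sufficient here.
    return cov / (sx * sy)

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
r = pearson(x, y)

# Change location and scale of each variable independently.
x_shifted = [10 * a + 3 for a in x]
y_shifted = [0.5 * b - 7 for b in y]
r_shifted = pearson(x_shifted, y_shifted)  # equal to r up to floating-point rounding
```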
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
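The pair-counting definition above corresponds to the variant usually called tau-a, and can be sketched directly. The function name `kendall_tau_a` and the toy data are assumptions for illustration; statistical libraries commonly report a tie-corrected variant (tau-b), so values on data with many ties may differ from this sketch.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs; tied pairs count as neither."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
    n = len(x)
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs

# Toy data: one swapped pair out of six possible pairs.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)  # 5 concordant, 1 discordant, 6 pairs
```

The quadratic pair loop is fine for a toy example but would be slow on a dataset of roughly three million rows like the one profiled here.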
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available.
First rows
| | Unnamed: 0 | user_id | session_id | session_start | session_size | article_id | click_timestamp | click_environment | click_deviceGroup | click_os | click_country | click_region | click_referrer_type | category_id | created_at_ts | publisher_id | words_count | delta_timestamp | session_end | session_time |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 1506825423271737 | 2017-10-01 02:37:03 | 2 | 157541 | 2017-10-01 03:00:28.020 | 4 | 3 | 20 | 1 | 20 | 2 | 281 | 2017-09-30 19:41:58 | 0 | 280 | 0 days 00:00:00 | 0 days 00:23:25.020000 | 1405.0 |
| 1 | 1 | 20 | 1506825727279757 | 2017-10-01 02:42:07 | 2 | 157541 | 2017-10-01 05:42:28.634 | 4 | 1 | 17 | 1 | 9 | 1 | 281 | 2017-09-30 19:41:58 | 0 | 280 | 0 days 00:00:00 | 0 days 03:00:21.634000 | 10822.0 |
| 2 | 2 | 44 | 1506826139185781 | 2017-10-01 02:48:59 | 5 | 157541 | 2017-10-01 11:27:58.141 | 4 | 1 | 17 | 1 | 12 | 1 | 281 | 2017-09-30 19:41:58 | 0 | 280 | 0 days 00:00:00 | 0 days 08:38:59.141000 | 31139.0 |
| 3 | 3 | 45 | 1506826142324782 | 2017-10-01 02:49:02 | 2 | 157541 | 2017-10-01 03:08:29.970 | 4 | 1 | 17 | 1 | 17 | 1 | 281 | 2017-09-30 19:41:58 | 0 | 280 | 0 days 00:00:00 | 0 days 00:19:27.970000 | 1168.0 |
| 4 | 4 | 76 | 1506826463226813 | 2017-10-01 02:54:23 | 2 | 157541 | 2017-10-01 03:33:43.469 | 4 | 3 | 2 | 1 | 21 | 1 | 281 | 2017-09-30 19:41:58 | 0 | 280 | 0 days 00:00:00 | 0 days 00:39:20.469000 | 2360.0 |
| 5 | 5 | 81 | 1506826491101818 | 2017-10-01 02:54:51 | 2 | 157541 | 2017-10-01 03:47:31.734 | 4 | 1 | 17 | 1 | 21 | 1 | 281 | 2017-09-30 19:41:58 | 0 | 280 | 0 days 00:00:00 | 0 days 00:52:40.734000 | 3161.0 |
| 6 | 6 | 121 | 1506826653149858 | 2017-10-01 02:57:33 | 3 | 157541 | 2017-10-01 03:06:43.302 | 2 | 3 | 20 | 10 | 28 | 2 | 281 | 2017-09-30 19:41:58 | 0 | 280 | 0 days 00:00:00 | 0 days 00:09:10.302000 | 550.0 |
| 7 | 7 | 143 | 1506826753261880 | 2017-10-01 02:59:13 | 3 | 157541 | 2017-10-01 03:05:30.024 | 4 | 1 | 12 | 1 | 21 | 2 | 281 | 2017-09-30 19:41:58 | 0 | 280 | 0 days 00:00:00 | 0 days 00:06:17.024000 | 377.0 |
| 8 | 8 | 153 | 1506826788896890 | 2017-10-01 02:59:48 | 3 | 157541 | 2017-10-01 03:00:47.697 | 4 | 3 | 20 | 1 | 24 | 2 | 281 | 2017-09-30 19:41:58 | 0 | 280 | 0 days 00:00:00 | 0 days 00:00:59.697000 | 60.0 |
| 9 | 9 | 155 | 1506826793302892 | 2017-10-01 02:59:53 | 2 | 157541 | 2017-10-01 03:02:56.135 | 4 | 3 | 20 | 1 | 25 | 2 | 281 | 2017-09-30 19:41:58 | 0 | 280 | 0 days 00:00:00 | 0 days 00:03:03.135000 | 183.0 |
Last rows
| | Unnamed: 0 | user_id | session_id | session_start | session_size | article_id | click_timestamp | click_environment | click_deviceGroup | click_os | click_country | click_region | click_referrer_type | category_id | created_at_ts | publisher_id | words_count | delta_timestamp | session_end | session_time |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2988171 | 2988171 | 81854 | 1508209728580941 | 2017-10-17 03:08:48 | 7 | 207280 | 2017-10-17 17:43:24.232 | 4 | 1 | 17 | 1 | 9 | 1 | 331 | 2017-10-17 11:43:43 | 0 | 291 | -1 days +23:43:41.644000 | 0 days 14:34:36.232000 | -978.0 |
| 2988172 | 2988172 | 81854 | 1508209728580941 | 2017-10-17 03:08:48 | 7 | 68786 | 2017-10-17 18:00:12.588 | 4 | 1 | 17 | 1 | 9 | 1 | 136 | 2017-10-17 14:40:28 | 0 | 197 | 0 days 00:16:48.356000 | 0 days 14:51:24.588000 | 1008.0 |
| 2988173 | 2988173 | 78814 | 1508209836324971 | 2017-10-17 03:10:36 | 3 | 32398 | 2017-10-17 03:41:51.765 | 4 | 3 | 2 | 1 | 25 | 1 | 26 | 2017-10-16 23:00:59 | 0 | 152 | -1 days +23:59:30 | 0 days 00:31:15.765000 | -30.0 |
| 2988174 | 2988174 | 193247 | 1508210308399099 | 2017-10-17 03:18:28 | 3 | 289453 | 2017-10-17 03:31:40.668 | 4 | 1 | 17 | 1 | 25 | 1 | 421 | 2017-10-16 16:29:59 | 0 | 211 | -1 days +23:59:30 | 0 days 00:13:12.668000 | -30.0 |
| 2988175 | 2988175 | 322856 | 1508210315294102 | 2017-10-17 03:18:35 | 3 | 237614 | 2017-10-17 03:18:46.972 | 4 | 3 | 2 | 1 | 21 | 7 | 375 | 2016-12-28 18:25:43 | 0 | 192 | -1 days +23:56:19.641000 | 0 days 00:00:11.972000 | -220.0 |
| 2988176 | 2988176 | 195186 | 1508210422411129 | 2017-10-17 03:20:22 | 4 | 2221 | 2017-10-17 03:21:09.562 | 4 | 3 | 2 | 1 | 1 | 1 | 1 | 2017-10-16 22:21:09 | 0 | 103 | -1 days +23:55:59.634000 | 0 days 00:00:47.562000 | -240.0 |
| 2988177 | 2988177 | 75658 | 1508210696185183 | 2017-10-17 03:24:56 | 4 | 271117 | 2017-10-17 03:29:11.703 | 4 | 1 | 17 | 1 | 4 | 2 | 399 | 2017-09-01 14:27:41 | 0 | 156 | 0 days 00:01:56.633000 | 0 days 00:04:15.703000 | 117.0 |
| 2988178 | 2988178 | 217129 | 1508210976336246 | 2017-10-17 03:29:36 | 2 | 20204 | 2017-10-17 03:29:50.810 | 4 | 3 | 2 | 1 | 21 | 5 | 9 | 2017-04-11 18:13:30 | 0 | 242 | 0 days 00:00:00 | 0 days 00:00:14.810000 | 15.0 |
| 2988179 | 2988179 | 217129 | 1508210976336246 | 2017-10-17 03:29:36 | 2 | 70196 | 2017-10-17 03:30:20.810 | 4 | 3 | 2 | 1 | 21 | 5 | 136 | 2017-04-04 09:34:55 | 0 | 206 | 0 days 00:00:30 | 0 days 00:00:44.810000 | 30.0 |
| 2988180 | 2988180 | 51099 | 1508211320193320 | 2017-10-17 03:35:20 | 2 | 98243 | 2017-10-17 03:43:02.523 | 4 | 3 | 2 | 1 | 25 | 5 | 220 | 2016-10-18 01:31:25 | 0 | 44 | -1 days +23:59:30 | 0 days 00:07:42.523000 | -30.0 |